Mapping Memphis

Let’s set some boundaries. Using the R packages tigris, sf, tidyverse, and leaflet, I created an interactive map of Memphis Census boundaries.

Intro

My overall goal for this blog is to study and understand housing in Memphis. To do that, I need to know some basic things, such as how many people and homes exist within the city. This information (and much more) is publicly available from the US Census. But before collecting and analyzing any data, it’s important to understand Census boundaries. In this post I will create a large interactive map, allowing users to see and compare boundaries relevant to Memphis. I also provide the code to download these boundaries using R packages.

Census Boundaries

Every ten years, the Census attempts to count every person living in the US, as required by the Constitution. Data is collected on individuals, households, and housing units. That data is then made public and used to determine important things like political boundaries.

Due to the personal nature of the data, it is anonymized and not released at too fine a geographic level.1 Instead, housing units are grouped into Census blocks, roughly equivalent to a neighborhood block. These blocks are then used to form other boundaries. Some boundaries nest within each other, like blocks into block groups into tracts, while others do not.
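The nesting also shows up in the identifiers (GEOIDs) attached to each boundary. Here is a minimal sketch in base R, assuming a made-up block GEOID rather than a real block I looked up. Only the state (47, Tennessee) and county (157, Shelby) prefixes are real codes; each level's GEOID is a prefix of the level below it.

```r
# Hypothetical 15-digit block GEOID: 47 = Tennessee, 157 = Shelby County.
# The tract and block digits are invented for illustration.
block_geoid <- "471570001001000"

geoid_parts <- c(
  state       = substr(block_geoid, 1, 2),   # 2-digit state code
  county      = substr(block_geoid, 1, 5),   # state + 3-digit county code
  tract       = substr(block_geoid, 1, 11),  # county + 6-digit tract code
  block_group = substr(block_geoid, 1, 12),  # tract + first digit of the block code
  block       = block_geoid                  # tract + full 4-digit block code
)

geoid_parts
```

Boundaries that do not nest, like ZCTAs, have no such shared prefix with tracts or counties.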

Below is a diagram of Census geographic hierarchies, available at https://www.census.gov/programs-surveys/geography/guidance/hierarchy.html.

Census Geographic Hierarchy

Boundary data is available from the Census TIGER/Line Shapefiles page. Datasets are also easily accessed in R using the tigris package.

The state level is often the smallest geography for which boundaries can be downloaded, meaning files can be very large. One benefit of using R is the ability to easily specify and filter for only the data you need. Geographic data can also be easily turned into an interactive map, allowing users options like zooming or turning layers on and off.

I’ve always wanted to create a map layering Census geographies for Memphis so I can better spot the differences. Thanks to a handful of R packages, I’m able to make this a reality.

Memphis Boundaries Map

Below is an interactive map of boundaries related to the Memphis area.

I decided not to include boundaries from the national to state levels, instead focusing just on the Memphis region. The broadest geography is the Memphis-Forrest City combined statistical area (CSA), which includes the Memphis Metro Area and Forrest City Micro Area. Metro and Micro areas are collectively known as core-based statistical areas (CBSAs), and are created using counties with “a high degree of social and economic integration” (measured by commute to work).2 Because Forrest City is relatively small, the CSA and CBSA boundaries differ by only one county. I plan to use the CBSA/Metro boundaries more often, which is why that layer is visible by default.

A few more boundaries have abbreviations that may be unfamiliar to the average person. ZCTAs, or ZIP Code Tabulation Areas, are the Census Bureau's approximations of USPS ZIP Codes. They are built from Census blocks but do not follow county, place, or tract boundaries. PUMAs, or Public Use Microdata Areas, are boundaries set every 10 years that each contain at least 100,000 people. They are notably used with Public Use Microdata Sample (PUMS) data, which are anonymized individual-level Census records.3 This data is useful to researchers who want to create custom queries rather than rely on the pre-tabulated estimates provided by the Census Bureau.

Layers will stack in the order they are selected.

Figure 1: Interactive leaflet map of Memphis boundaries

Making this map gave me a greater understanding of all the different ways to look at Memphis and how boundaries differ. For a while I have been interested in using PUMS data, but I’ve been partially held back by not understanding local PUMA boundaries. I have also never used ZCTA data, and now I know that if I use it in the future, I need to be mindful that ZCTAs often cross county borders. School districting is also a hot topic in Shelby County, so it’s useful to quickly see which suburbs have set up their own municipal school districts.

The Code

This post, the map, and all data collection were completed using R/RStudio. Boundaries were pulled using the tigris package. A full list of available datasets and their corresponding functions is available in Chapter 5 of the book Analyzing US Census Data.

The Memphis Metro area spans parts of three states, meaning I sometimes needed to run the same function three times, once each for Tennessee, Arkansas, and Mississippi. I used purrr’s map_dfr() to do this automatically and bind the results into one table. Boundaries were then filtered in one of two ways: with dplyr’s filter() on a column value, or with sf’s st_filter() to filter one geometry by another.
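To make the difference between the two filters concrete, here is a minimal sketch that runs without downloading anything. The squares and their names are made-up stand-ins (not real boundaries), and I left the crs unset so the coordinates are just planar units.

```r
library(sf)
library(dplyr)

# Helper (hypothetical) that builds a square polygon from its lower-left corner.
square <- function(xmin, ymin, size) {
  st_polygon(list(rbind(
    c(xmin, ymin), c(xmin + size, ymin), c(xmin + size, ymin + size),
    c(xmin, ymin + size), c(xmin, ymin)
  )))
}

# One large "metro" square and three small "county" squares.
metro_toy <- st_sf(name = "metro", geometry = st_sfc(square(0, 0, 4)))
counties_toy <- st_sf(
  name = c("inside", "straddling", "outside"),
  geometry = st_sfc(square(1, 1, 1), square(3.5, 1, 1), square(6, 1, 1))
)

# Attribute filter: keep rows by a column value, like any data frame.
by_name <- counties_toy %>% filter(name == "inside")

# Spatial filter with the default predicate (st_intersects):
# keeps every shape that overlaps or touches the metro square.
touching <- counties_toy %>% st_filter(metro_toy)

# Spatial filter with st_within: keeps only shapes entirely inside the metro square.
inside_only <- counties_toy %>% st_filter(metro_toy, .predicate = st_within)

touching$name
inside_only$name
```

The choice of predicate matters most for boundaries that cross the edge of the filtering shape: the default keeps them, while st_within drops them.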

In summary, these are the packages I used to get started (I also recommend caching your datasets, but it’s optional).

library(tidyverse)
library(sf)
library(tigris)
options(tigris_use_cache = TRUE)

Below is a full list of how I gathered boundaries related to Memphis, grouped by how they’re filtered.

#' Broad regions, filtered by known NAME/DIVISIONs relevant to Memphis Area
nation <- nation()
regions <- regions() %>% 
  filter(NAME == "South")
divisions <- divisions() %>% 
  filter(str_detect(NAME, "South Central"))
states <- states() %>% 
  filter(DIVISION %in% c(6, 7))
memST <- list("TN", "MS", "AR")

#' Metro and Urban areas, filtered by "Memphis"
csa <- combined_statistical_areas() %>% 
  filter(str_detect(NAME, "Memphis")) 
cbsa <- core_based_statistical_areas() %>% 
  filter(str_detect(NAME, "Memphis")) 
urb <- urban_areas() %>% 
  filter(str_detect(NAME10, "Memphis"))

#' Areas in Metro or Urban area
csaCounties <- map_dfr((memST), ~{counties(.x)}) %>% 
  st_filter(csa, .predicate = st_within)
counties <- csaCounties[cbsa, op = st_within]
places <- map_dfr((memST), ~{places(.x)}) %>% 
  st_filter(cbsa, .predicate = st_within)
sld <- state_legislative_districts("TN") %>% 
  st_filter(cbsa, .predicate = st_within)
zcta <- zctas(cb = TRUE, starts_with = c("38", "72")) %>% 
  st_filter(urb)
puma <- map_dfr((memST), ~{pumas(.x)}) %>% 
  st_filter(urb)

#' Shelby County and Memphis
shelby <- counties %>% filter(NAME == "Shelby")
memphis <- places %>% filter(NAME == "Memphis")

#' Areas within Shelby County
school <- school_districts("TN") %>% 
  st_filter(shelby, .predicate = st_within)
cong <- congressional_districts("TN") %>% 
  st_filter(shelby, .predicate = st_within)
vote <- voting_districts("TN", "Shelby")
csd <- county_subdivisions("TN", "Shelby")
tracts <- tracts("TN", "Shelby")
blkgrp <- block_groups("TN", "Shelby")

The above code may look like a lot, but running it will quickly pull from 18 datasets and automatically filter for the Memphis area. Of course, unless you’re also trying to map all the boundaries at once, there’s no reason to do this. Instead, the above code chunk is meant to be used as a reference, to quickly copy and paste whichever specific boundary is needed.

The final map was created using leaflet. The website and package documentation were useful when I got stuck.
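For anyone curious about the general shape of a layered leaflet map, here is a minimal sketch. It is not the exact code behind the map above: the two rectangles are made-up stand-ins for real boundaries, and the group names, colors, and choice of hidden layer are assumptions for illustration.

```r
library(sf)
library(leaflet)

# Stand-in rectangles around Memphis (hypothetical, not real boundaries).
box <- function(xmin, ymin, xmax, ymax) {
  st_as_sfc(st_bbox(
    c(xmin = xmin, ymin = ymin, xmax = xmax, ymax = ymax),
    crs = st_crs(4326)
  ))
}
metro_toy  <- st_sf(NAME = "Toy metro",  geometry = box(-90.6, 34.7, -89.4, 35.6))
county_toy <- st_sf(NAME = "Toy county", geometry = box(-90.3, 35.0, -89.6, 35.4))

m <- leaflet() %>%
  addTiles() %>%                                   # default OpenStreetMap basemap
  addPolygons(data = metro_toy, group = "CBSA/Metro", label = ~NAME,
              color = "purple", weight = 2, fillOpacity = 0.1) %>%
  addPolygons(data = county_toy, group = "County", label = ~NAME,
              color = "darkgreen", weight = 2, fillOpacity = 0.1) %>%
  addLayersControl(                                # checkboxes to toggle each layer
    overlayGroups = c("CBSA/Metro", "County"),
    options = layersControlOptions(collapsed = FALSE)
  ) %>%
  hideGroup("County")                              # start with only the metro layer visible

m
```

With real data, each stand-in would be swapped for one of the tables pulled above (for example, data = cbsa). Since tigris boundaries come in NAD83, leaflet may warn about the datum unless the layers are first passed through st_transform(4326).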

Acknowledgements

To make this post, I heavily referenced the book Analyzing US Census Data by Kyle Walker, particularly Chapter 5, which covers basic usage of tigris, and Chapter 7, which explains how to spatially filter data. There is also a more in-depth explanation of Census boundaries in Chapter 1.

Note on CRS

A coordinate reference system (crs), in essence, tells a map how to turn coordinates into locations on the Earth and how to flatten them onto a screen. If your map doesn’t look right, like it’s skewed or warped, you probably need to transform the data to a more suitable crs. If you plan to combine maps, giving every layer the same crs ensures projections are consistent.

If you do not change the crs, boundaries downloaded with tigris will use 4269 (NAD 1983). The package crsuggest can help find a suitable crs for your map. I also found the website epsg.io useful; for example, you can quickly see the boundaries for the crs code 6510 at https://epsg.io/6510.

That crs was the top suggestion for my maps, but its coverage area excluded Tennessee, and I found the difference from the default minimal. For simplicity, I stuck with the default.

You can check a table’s crs using st_crs(). If you need to adjust the crs of a dataset, one way is to use st_transform(crs = ####). For more info, see the section on crs in Analyzing US Census Data.
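As a minimal sketch of both functions, the example below uses a single point with approximate coordinates for downtown Memphis (my own stand-in, not one of the downloaded boundaries) and reprojects it to the 6510 code mentioned above.

```r
library(sf)

# One point near downtown Memphis (approximate lon/lat), stored in NAD83 (4269).
pt <- st_sfc(st_point(c(-90.05, 35.15)), crs = 4269)

# Check the current crs; $epsg pulls out just the numeric code.
st_crs(pt)$epsg

# Reproject to crs 6510. The coordinate values change, the location does not.
pt_6510 <- st_transform(pt, crs = 6510)
st_crs(pt_6510)$epsg

st_coordinates(pt)
st_coordinates(pt_6510)
```

The same two calls work on a full table of boundaries, for example st_transform(tracts, crs = 6510).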

  1. The Census also alters some data to hide personally identifying info. For instance, if someone is the only person of a certain race in a block, the Census may swap that person’s info to another block within the block group to protect their identity. According to the Census, these adjustments should not significantly impact data analysis. For more information, see the Census’s page on statistical safeguards.

  2. US Census Bureau, “Micropolitan and Metropolitan: About,” October 8, 2021, https://www.census.gov/programs-surveys/metro-micro/about.html.

  3. Until the API is complete, PUMS data is available for download from https://www.ipums.org/.

References

US Census Bureau. “Micropolitan and Metropolitan: About,” October 8, 2021. https://www.census.gov/programs-surveys/metro-micro/about.html.